A Bag-of-phonemes Model for Homeplace Classification of Mandarin Speakers
نویسندگان
چکیده
Mandarin, also known as Standard Chinese is the official language of China and Singapore, there are certain differences when mandarin is spoken by people from different homeplaces. The homeplace classification is important in speech recognition and machine translation. In this paper, we proposed a novel model named Bag-of-phonemes (BOP) for homeplace classification of mandarin speakers, which follows the conceptually similar idea of the Bag-of-words (BOW) model in text processing. The low-level Mel-frequency cepstral coefficients (MFCC) speach features of each homeplace are clustered into a set of codewords referred to as phonemes. With this codebook, each speech signal can be represented by a feature vector of distribution on phonemes. Classical classifiers such as support vector machine (SVM) can be applied for classification. This model is tested by RASC863 database, empirical studies show that the new model has a better performance on the RASC863 database comparing to previous works [1].
منابع مشابه
The Role of Phoneme in Mandarin Chinese Production: Evidence from ERPs
Established linguistic theoretical frameworks propose that alphabetic language speakers use phonemes as phonological encoding units during speech production whereas Mandarin Chinese speakers use syllables. This framework was challenged by recent neural evidence of facilitation induced by overlapping initial phonemes, raising the possibility that phonemes also contribute to the phonological enco...
متن کاملPalarimetric Synthetic Aperture Radar Image Classification using Bag of Visual Words Algorithm
Land cover is defined as the physical material of the surface of the earth, including different vegetation covers, bare soil, water surface, various urban areas, etc. Land cover and its changes are very important and influential on the Earth and life of living organisms, especially human beings. Land cover change monitoring is important for protecting the ecosystem, forests, farmland, open spac...
متن کاملMandarin-English Bilinguals Process Lexical Tones in Newly Learned Words in Accordance with the Language Context
Previous research has mainly considered the impact of tone-language experience on ability to discriminate linguistic pitch, but proficient bilingual listening requires differential processing of sound variation in each language context. Here, we ask whether Mandarin-English bilinguals, for whom pitch indicates word distinctions in one language but not the other, can process pitch differently in...
متن کاملA contrastive investigation of standard Mandarin and accented Mandarin
Segmental and supra-segmental acoustic features between standard and Shanghai-accented Mandarin were analyzed in the paper. The Shanghai Accented Mandarin was first classified into three categories as light, middle and heavy, by statistical method and dialectologist with subjective criteria. Investigation to initials, finals and tones were then carried out. The results show that Shanghainese al...
متن کامل